Mental States as Macrostates 
Emerging from EEG Dynamics 
Abstract

Correlations between psychological and physio- 
logical phenomena form the basis for different 
medical and scientific disciplines, but the nature 
of this relation has not yet been fully understood. 
One conceptual option is to understand the mental 
as "emerging" from neural processes in the spe- 
cific sense that psychology and physiology pro- 
vide two different descriptions of the same sys- 
tem. Stating these descriptions in terms of coarser- 
and finer-grained system states (macro- and mi- 
crostates), the two descriptions may be equally ad- 
equate if the coarse-graining preserves the possi- 
bility to obtain a dynamical rule for the system. 
To test the empirical viability of our approach, we 
describe an algorithm to obtain a specific form of 
such a coarse-graining from data, and illustrate 
its operation using a simulated dynamical system. 
We then apply the method to an electroencephalo- 
graphic (EEG) recording, where we are able to 
identify macrostates from the physiological data 
that correspond to mental states of the subject. 



1 Introduction 

The existence of correlations between psychologi- 
cal and physiological phenomena, especially brain 
processes, is the basic empirical fact of psycho- 
physiological research. Relations between mental 
processes, including modes of consciousness, and 
those occurring in its physical "substrate", the cen- 
tral nervous system, are generally taken as a mat- 



ter of course: They form the basis for the use of 
drugs in the treatment of mental disorders in psy- 
chiatry, they are applied as a research tool to shed 
light on the details of psychological mechanisms, 
and the explication of the neural structures underlying
mental functioning forms the subject of cognitive
neuroscience. Still, it remains unclear what
the nature of the observed correlations is and what
exactly is to be conceived as a neural correlate of a
psychological phenomenon.¹

One way to approach these issues is to interpret
the mental as a domain emerging from an underlying
physiological domain (Broad 1925; Beckermann et al. 1992).
However, despite its long history reaching to recent
scientific contributions (e.g. Darley 1994; Seth 2008),
the term emergence is not very well defined and it is
used in a large number of different meanings
(cf. Stephan 2002; O'Connor and Wong 2006).



In our understanding, emergence is a relation 
between different descriptions of the same system. 
In this view, the occurrence and correlation of psy- 
chological and physiological phenomena is due to 
the fact that the object of psychophysiological re- 
search (the research subject) can be approached 
and examined in different ways. More specifically, 
emergence is to be conceived as a relation between 
different descriptions each of which is useful or 
adequate in its own manner. The question arises
how there can be more than one adequate description
for the same system, and what has to be the
nature of their relation in order to permit this.²

¹ While the recent discussion focuses on the notion of "neural
correlates of consciousness" (cf. Metzinger 2000), we are
interested in psychophysiological correlations in general.

In this paper we present one possible answer to
these questions, motivated by ideas on the emergence
of mental states from neurodynamics introduced
by Atmanspacher and beim Graben (2007),
where the two descriptions take on the form of a
dynamical system. We introduce the further specification
that the relation between the two associated
state spaces is characterized by a Markov
coarse-graining (Sec. 2), which leads us to consider
metastable states as a particular form of emergent
states. In order to demonstrate the practical viability
of these ideas, we develop a method to identify
metastable states from empirical data (Sec. 3), and
illustrate the operation of the algorithm using data
from a simulated system (Sec. 4). In the application
of the method to a recording of brain electrical
activity, we are able to identify states closely
corresponding to the mental states of a subject, based
on the analysis of the EEG data alone (Sec. 5).

2 Emergence in dynamical systems

A descriptive approach that has proven very fruitful
in physics and other fields of the natural sciences
is utilizing the concept of a dynamical system
(Robinson 1995; Chan and Tong 2001). Such a description
is formulated with respect to the states
the system can assume, and a dynamical rule that
defines the way the state of the system evolves
over time. The possible system states form a state
space, which in the most general case is just a set
of identifiable and mutually distinguishable elements.³



² Note that we are concerned with the atemporal or "synchronous"
structure of such a relation between descriptions,
and do not address the question of how a phenomenon
emerges "diachronically", in a process unfolding in time.

³ This is in accordance with the concept of system states in
cybernetics and related disciplines (cf. Ashby 1962), but is at
variance with the use of the term in physics where a state space
is generally taken to be spanned by a set of observables (properties
that can be precisely quantified). Such a less structured
concept of state space is useful because it also covers cases
where it is not obvious how to endow that space with a formal
structure, for instance mental states. However, as Gaveau
and Schulman (2005) point out, introducing into a state space
a dynamics in the form of transition probabilities (see below)
implicitly provides it with a metric structure.



For a well-defined relation between two such 
descriptions to hold, it is necessary that the two 
state spaces can be related to each other.⁴ A simple
possibility is that the system assumes a particular
state in one description exactly if it is in
any out of a certain set of states of the other de- 
scription; that is to say, one state space is a coarse- 
graining or partition of the other state space. Be- 
cause of this asymmetry between the two descrip- 
tions one may speak of a higher-level and a lower- 
level description, and refer correspondingly to ma- 
crostates and microstates of the system. The classic 
example in physics for this kind of inter-level re- 
lation is that between the phenomenological the- 
ory of thermodynamics, dealing with the macro- 
states of extended systems defined in terms of ob- 
servables such as temperature and pressure, and 
the theory of statistical mechanics, relating them 
to microstates defined in terms of the constituents 
of those systems.⁵

The description of a system is chosen by an ob- 
server, but it is also subject to objective constraints 
insofar as different descriptions may be differently 
adequate or useful. For a description as a dynam- 
ical system, the adequacy of a particular set of 
system states becomes apparent in the possibility 
to find a dynamical rule, $\phi_{\Delta t}$, whereby the current
state $x_t$ of the system determines its further evolution,

$$x_{t+\Delta t} = \phi_{\Delta t}(x_t);$$

here $t$ is a continuous or discrete time variable and
$\Delta t$ a time interval. A particular state space definition
may therefore be called dynamically adequate
if the specification of a state implies all the available
information which is relevant for determining
subsequent states, that is, if in this description
the system possesses the Markov property (cf. Shalizi
and Moore 2008). In this sense, the most general
model of a dynamical system is the Markov
process, a stochastic model which includes deterministic
dynamics as a limiting case (cf. Chan and Tong 2001).

⁴ This of course does not have to be the case; different
descriptions of the same system may also be incompatible with
each other.

⁵ In this context the terms macrostate and microstate derive
from the circumstance that they refer to the properties of a
"macroscopic" system versus those of its "microscopic"
constituents. Though these terms often imply a difference in
spatiotemporal scale, the important point is the difference in the
amount of detail given by the descriptions.

It is important to note that this criterion for 
selecting a descriptive level implies a reference 
back to that same level; while employing a more 
fine-grained set of states may serve to improve 
the prediction of the future of a system in gen- 
eral, it will in most cases result in a loss of 
the Markov property with respect to these finer- 
grained states themselves. In other words, the 
Markov-property criterion distinguishes descriptive
levels at which the system exhibits a self-contained
dynamics ("eigendynamics"), independent
of details present at other levels.⁶

This specification of the kind of descriptions
sought for leads to a more specific concept of
emergence as an inter-level relation.⁷ Given a
microscopic state description exhibiting the Markov
property, an adequate higher-level description or
coarse-graining should ideally preserve it. In the
context of deterministic nonlinear systems, where
the dynamics is defined by a map from a metric
space onto itself, such a coarse-graining is called a
Markov partition (Adler 1998; Bollt and Skufca 2005);
for the general case of stochastic dynamics
we propose the term Markov coarse-graining (cf.
Gaveau and Schulman 2005). Accordingly, states
of a higher-level description may be called dynamically
emergent states if they correspond to a
Markov coarse-graining of a lower-level dynamics.

Interpreting psychophysiological correlations as 
reflecting a relation of emergence between two 
levels of description as a dynamical system, the 
lower-level description is stated in terms of phys- 
iological, neural states, the higher-level descrip- 
tion in terms of mental states. At both levels a 
wide variety of descriptive approaches is possi- 
ble, depending on the experimental methods used 



⁶ This concept is akin to the idea of operational closure or
autonomy in the theory of autopoietic systems (Maturana and
Varela 1980; Varela 1979), which alongside the separation from
the environment also refers to the indifference of system
operations towards the internal "microscopic" complexity of system
elements (cf. Luhmann 1996). However, the topic of self-defined
system boundaries is not addressed in this paper and
accordingly, the term "system" is used in the unspecific sense
of a section of reality which has been chosen for observation.

⁷ The stability conditions of Atmanspacher and beim Graben
(2007) are here realized by the Markov property, while contextual
constraints (Bishop and Atmanspacher 2006) can be seen
effective in the selection of a particular descriptive level out of
those admissible.



to assess the brain state on the one hand (elec- 
trophysiology, imaging methods, brain chemistry, 
etc.) and the chosen set of psychological categories
on the other hand (conscious / unconscious,
sleep stages, moods, cognitive modes, etc.).⁸ Applying
the dynamical specification of emergence
outlined above, emergent macrostates that are defined
via a Markov coarse-graining of the neural
microstate dynamics are candidates for a further
characterization as mental states. In order to
empirically substantiate these ideas, macrostates
obtained from the dynamics that has been observed
in neurophysiological data are to be related
to mental states of subjects that have been determined
by other means, such as behavioral assessment
or verbal reports.

In the following we undertake first steps to- 
wards this program. Since a general algorithm for 
finding Markov coarse-grainings is not known, we 
focus on the special case of metastable states. Be- 
cause a system stays in such a state for prolonged 
periods of time and only occasionally switches 
into another, antecedent states provide practically 
no information on the subsequent evolution beyond
that implied in the current state, so that
the macrostate dynamics is approximately Markovian.
The following section describes an algorithm
to obtain metastable states from the microstate dy- 
namics observed in empirical data. 



3 Identifying metastable macrostates from data

Metastable states correspond to the "almost invariant
sets" of a dynamical system, i.e. subsets of
the state space which are approximately invariant
under the system's dynamics. Since we are dealing
with empirical data where there is generally
no precise theoretical knowledge of the dynamics,
it has to be determined from the data.

Via a finite set of microstates resulting from a
discretization of the state space (Sec. 3.1), the time
evolution operator $\phi_{\Delta t}$ is estimated in the form of
a matrix of transition probabilities $P$ (Sec. 3.2).
Metastable states are then determined using an algorithm
to find the almost invariant sets of a Markov
process (Sec. 3.3). Additionally, an estimate of
the optimal number of macrostates is obtained
via an analysis of the characteristic timescales of
the dynamics (Sec. 3.4). Our algorithm builds on
work by Deuflhard and Weber (2005), Gaveau and
Schulman (2005), and Froyland (2005), and re-uses
an idea of Allefeld and Bialonski (2007).

⁸ Since each mental state allows for multiple realizations at
the neural level, mental states may be said to "supervene on"
brain states (cf. Kim 1993), but this alone does not provide a
sufficient characterization of their relation. Moreover, contrary
to assumptions prevalent in the discussion (cf. Chalmers 2000),
a neural correlate need not necessarily be realized in a particular
neural subsystem of the brain.



3.1 Discretization of the microstate space

In order to represent the observed microstate dynamics
as a finite-state Markov process, the state
space defined by $K$ variables $\{x_1, x_2, \ldots, x_K\} = \mathbf{x}$
has to be discretized, resulting in a set of compound
microstates which forms the basis for further
analysis. Since the data set may be high-dimensional
and of varying density in different
areas of the state space, we need a flexible algorithm
which adapts the size and shape of microstate
cells to local properties of the distribution of
data points.

This procedure has to meet two competing
goals: It should capture as much detail as possible
in order to faithfully represent the underlying continuous
dynamics within its discretized version;
but since transition probabilities between cells are
to be estimated, the number of data points per cell
should not fall below a certain minimum. Moreover,
the extensions of the cells in the directions
of the different variables should be of roughly the
same size.⁹

To achieve this, we use a recursive bipartitioning
approach (Fig. 1): For a given set of $n$ data
points $S = \{\mathbf{x}_m\}$, $m = 1 \ldots n$, the direction of
maximal variance is determined, i.e. a unit vector
$\mathbf{e}$, $|\mathbf{e}| = 1$, such that $\mathrm{var}_m(\mathbf{x}_m \cdot \mathbf{e})$ obtains its
maximum value. Using the median $M$ of the data
points' positions along this direction as a threshold
value, the set is divided into two subsets,

$$S_1 = \{\mathbf{x}_m \mid \mathbf{x}_m \cdot \mathbf{e} < M\}, \qquad S_2 = \{\mathbf{x}_m \mid \mathbf{x}_m \cdot \mathbf{e} \geq M\}.$$

The procedure is repeated for each of the resulting
subsets, up to a recursion depth of $b$ steps. This
algorithm leads to a practically identical number
of data points per cell (either $\lfloor n/2^b \rfloor$ or $\lceil n/2^b \rceil$)
which can be adjusted via the parameter $b$. It provides
a high level of detail in those areas of the
state space where the system spends most of the
time, and it avoids too elongated cells by applying
cuts perpendicular to the current main extension.

Figure 1: Discretization by recursive bipartitioning, illustrated
with a set of data points drawn from a two-dimensional
normal distribution stretched out along the
main diagonal. Cuts occurring earlier in the procedure
are indicated by thicker lines.

⁹ We assume at this point that the variables spanning the
state space permit a comparison of distances along different
directions. Where this is not the case it is advisable to map
all variables onto the same range of values before performing
the discretization.
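
The recursive bipartitioning described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the implementation used for the analyses reported here: the function name `bipartition` and its arguments are hypothetical, the direction of maximal variance is taken as the leading eigenvector of the covariance matrix, and the median split is realized by rank ordering (rather than by an explicit threshold) so that ties cannot unbalance the two halves. It assumes at least 2^depth data points.

```python
import numpy as np

def bipartition(points, depth):
    """Assign each row of `points` (shape n x K) to one of 2**depth cells.

    Hypothetical helper illustrating recursive bipartitioning along the
    direction of maximal variance; returns an integer label per data point.
    """
    n = points.shape[0]
    labels = np.zeros(n, dtype=int)

    def split(idx, level, label):
        if level == depth:
            labels[idx] = label
            return
        sub = points[idx]
        centered = sub - sub.mean(axis=0)
        # Leading eigenvector of the (unnormalized) covariance matrix
        # is the unit vector e along which the variance is maximal.
        _, vecs = np.linalg.eigh(centered.T @ centered)
        e = vecs[:, -1]
        # Order points by their position along e and cut at the median rank,
        # so the two subsets differ in size by at most one point.
        order = np.argsort(sub @ e, kind="stable")
        half = len(idx) // 2
        split(idx[order[:half]], level + 1, 2 * label)
        split(idx[order[half:]], level + 1, 2 * label + 1)

    split(np.arange(n), 0, 0)
    return labels
```

Called as `bipartition(data, b)`, the sketch returns microstate indices in the range 0 to 2^b - 1 (zero-based, unlike the one-based indices used in the text).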

3.2 Microstate dynamics 

Via the bipartitioning procedure, each data point
$\mathbf{x}_m$ ($m = 1 \ldots n$) is assigned to one out of a finite set
of microstates, identified by an index $\mu \in \{1 \ldots N\}$
($N = 2^b$). The observed sequence of data points
(where the index $m$ enumerates samples taken at
consecutive time points) is thereby transformed
into a sequence of microstate indices $\mu_m$. Considering
this sequence of compound microstates as a
realization of a finite-state Markov process, the underlying
dynamics is described by a discrete transfer
operator $P$, an $N \times N$ matrix of transition probabilities
between states,

$$P_{ij} = \Pr(\mu_{m+1} = i \mid \mu_m = j),$$

which may be estimated according to

$$P_{ij} = \frac{C_{ij}}{\sum_{i'} C_{i'j}},$$

where

$$C_{ij} = \#\{m \mid \mu_{m+1} = i \wedge \mu_m = j\}$$

is the number of observed transitions from state $j$
to state $i$.

We assume that the Markov process described
by $P$ is irreducible, i.e. that it is possible to reach
any state from any other state.¹⁰ If this is not the
case, the system has not only almost invariant but
proper invariant sets, each forming an irreducible
process of its own which itself may be subjected to
a search for almost invariant subsets.¹¹ We assume
moreover that the process is aperiodic, which is
already the case if only one diagonal element $P_{ii}$
is different from zero. For a finite-state Markov
process these two properties amount to ergodicity,
which implies that there exists a unique invariant
probability distribution $\pi$ over microstates, with
$P\pi = \pi$, which is also the limit distribution approached
from every initial condition.

The analysis of the dynamical properties of a
Markov process leading to the identification of its
metastable states is strongly facilitated if it is reversible,
i.e. if the dynamics is invariant under
time reversal: $P_{ij}\,\pi_j = P_{ji}\,\pi_i$ for all $i, j$. This property
cannot usually be assumed for an arbitrary
empirically observed process. But since the property
of metastability, the tendency of the system
to stay within certain regions of the state space for
prolonged periods of time, is itself indifferent with
respect to the direction of time (Froyland 2005),
we can base the search for the corresponding almost
invariant sets on the transition matrix for the
reversibilized process $R$,

$$R_{ij} = \frac{1}{2}\left(P_{ij} + \frac{P_{ji}\,\pi_i}{\pi_j}\right),$$

instead. This operator can be directly estimated
according to

$$R_{ij} = \frac{C_{ij} + C_{ji}}{\sum_{i'} (C_{i'j} + C_{ji'})},$$

i.e. by counting transitions forwards and backwards
in time, and the corresponding invariant
probability distribution determined as

$$\pi_i = \frac{\sum_j (C_{ij} + C_{ji})}{\sum_{i',j} (C_{i'j} + C_{ji'})}.$$

In the following we will use the symbols $R$ and $\pi$
to denote these estimated quantities.

¹⁰ We use the terminology and results of Feller (1968), Ch. XV.

¹¹ Another possible problem is that there may be states a transition
into or out of which is never observed, because they
only occur at the beginning or end of the given data segments.
Along with the general possibility of transient states, this issue
is resolved in a natural way by the reversibilization step
described below.
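
As a minimal sketch of the estimation step, the following Python function counts transitions forwards and backwards in time and normalizes the symmetrized counts column-wise, following the convention $R_{ij} = \Pr(i \mid j)$ used above. The function name and the zero-based state indices are assumptions made for illustration, and the sketch treats the input as one contiguous sequence; data consisting of several segments would need the counts accumulated per segment.

```python
import numpy as np

def reversibilized_transition_matrix(mu, n_states):
    """Estimate R and pi from a sequence of microstate indices.

    `mu` holds integer indices in 0..n_states-1 for consecutive samples.
    Returns the column-stochastic matrix R and the distribution pi.
    """
    mu = np.asarray(mu)
    C = np.zeros((n_states, n_states))
    # C[i, j] is the number of observed transitions from state j to state i.
    np.add.at(C, (mu[1:], mu[:-1]), 1.0)
    S = C + C.T                      # transitions counted in both time directions
    col = S.sum(axis=0)
    if np.any(col == 0):
        raise ValueError("every state must take part in at least one transition")
    R = S / col                      # divide each column j by its sum
    pi = col / col.sum()             # invariant distribution of R
    return R, pi
```

Because the symmetrized count matrix is symmetric, the stationary flow `R * pi` is symmetric as well, so the estimated process is reversible by construction.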

3.3 Almost invariant sets 

To identify almost invariant sets we employ the 
PCCA+ algorithm which was developed by Deuflhard
and Weber (2005) to find metastable states in
the conformation dynamics of molecules. In this 
section we outline the main ideas of that approach 
which are necessary to understand the operation 
of the method, while for further details on the im- 
plementation and mathematical background the 
reader is referred to their paper. 

Our starting point is the ergodic and reversible
Markov process characterized by the $N \times N$
transition matrix $R$ along with its invariant probability
distribution $\pi$. Due to the reversibility of the
process (symmetry of the stationary flow $R_{ij}\,\pi_j$),
the left and right eigenvectors $A_k$, $\rho_k$ and eigenvalues
$\lambda_k$ of $R$,

$$A_k R = \lambda_k A_k, \qquad R \rho_k = \lambda_k \rho_k, \qquad k = 1 \ldots N,$$

are real-valued. Resolving the scaling ambiguity
left by the orthonormality relation $A_k \rho_l = \delta_{kl}$ by
the choice $\rho_{ki} = \pi_i A_{ki}$ leads to the normalization
equations

$$\sum_i \pi_i A_{ki}^2 = 1 \quad \text{and} \quad \sum_i \frac{\rho_{ki}^2}{\pi_i} = 1,$$

and the transition matrix can be given the spectral
representation

$$R = \sum_k \lambda_k\, \rho_k A_k.$$

We assume that the eigenvalues are sorted in descending
order, $\lambda_1 > \lambda_2 > \ldots > \lambda_N$. The unique
largest eigenvalue $\lambda_1 = 1$ belongs to the invariant
probability distribution, $\rho_1 = \pi$, while the
corresponding left eigenvector has constant coefficients,
$A_{1i} = 1$.

If the process $R$ possesses $q$ almost invariant
sets, it can be seen as the result of a perturbation
of a process $\tilde{R}$ that possesses $q$ perfectly invariant
sets. Any (normalized) element of the right
eigenvector subspace of $\tilde{R}$ belonging to eigenvalues
$\lambda_1 = \ldots = \lambda_q = 1$ gives an invariant probability
distribution of the process. Moreover, if the
invariant sets are described by characteristic functions
$\chi_l$ ($l = 1 \ldots q$), such that $\chi_l(i) = 1$ if state $i$
belongs to invariant subset $l$, and $0$ otherwise, then any
linear combination of them is a left eigenvector of
$\tilde{R}$ for eigenvalue 1. Conversely, from any linearly
independent set of left eigenvectors for eigenvalue
1, $\{A_1, A_2, \ldots, A_q\}$, the characteristic functions of
the invariant sets can be recovered via suitable linear
combinations.

Through the perturbation, the multiple eigenvalue
1 becomes a cluster of large eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_q$ close to $\lambda_1 = 1$, and the invariant
sets become almost invariant sets. They are described
by almost characteristic functions $\chi_l(i)$,
attaining values in the range $[0, 1]$ which may be
interpreted as quantifying the degree to which state $i$
belongs to almost invariant set $l$. In analogy to the
unperturbed case, these functions are constructed
as linear combinations of the left eigenvectors belonging
to the $q$ large eigenvalues,

$$\chi_l(i) = \sum_{k=1}^{q} \alpha_{kl}\, A_{ki}, \qquad l = 1 \ldots q,$$

defined by coefficients $\alpha = (\alpha_{kl})$. Admissible are
those regular transforms that conform to the constraints

• partition of unity: $\sum_l \chi_l(i) = 1$ for all $i$, and

• non-negativity: $\chi_l(i) \geq 0$ for all $i, l$.



The PCCA+ algorithm optimizes the transform $\alpha$
with respect to an objective function to be maximized;
we here choose the maximum scaling function

$$J(\alpha) = \sum_l \max_i \chi_l(i),$$

which favors attributions of states $i$ to almost invariant
sets $l$ that are as clear-cut as possible.

The input data for the optimization are the dominant
left eigenvectors $A_k$, $k = 1 \ldots q$. The first
eigenvector is trivially $A_1 = (1, \ldots, 1)$, but the remaining
eigenvector coefficients can be geometrically
interpreted as attributing to each microstate $i$
a position in a $(q - 1)$-dimensional left eigenvector
space with position vectors

$$\mathbf{o}(i) = (A_{ki}), \qquad k = 2 \ldots q.$$

Within this space, the optimization procedure appears
as fitting a $q$-simplex as closely as possible
around the microstate points. In a system with
pronounced metastable macrostates each of them
appears as a cluster of microstates located at the
boundary of the point cloud, and the optimization
procedure matches each of these $q$ clusters to one of the
vertices of the $q$-simplex. The values of the almost
characteristic functions $\chi_l(i)$ then attain the geometric
meaning of barycentric coordinates of the
data points with respect to the locations of the simplex
vertices $\mathbf{v}_l$:

$$\mathbf{o}(i) = \sum_{l=1}^{q} \chi_l(i)\, \mathbf{v}_l.$$

Finally, metastable macrostates corresponding
to almost invariant sets of microstates are identified
by attributing each microstate $i$ to that macrostate
$l \in \{1 \ldots q\}$ for which the almost characteristic
function $\chi_l(i)$ attains the highest value (or,
to whose defining vertex it is closest in terms of
barycentric coordinates).
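
The geometric reading of the almost characteristic functions can be illustrated with a short Python sketch. It does not implement the PCCA+ optimization itself: the simplex vertices are assumed to be already known, and the function names are hypothetical. Given the vertices, the barycentric coordinates follow from the two conditions stated above, $\mathbf{o}(i) = \sum_l \chi_l(i)\,\mathbf{v}_l$ and $\sum_l \chi_l(i) = 1$, which together form a linear system.

```python
import numpy as np

def barycentric_memberships(o, vertices):
    """Barycentric coordinates of points with respect to simplex vertices.

    o:        array of shape (N, q-1), microstate positions in eigenvector space
    vertices: array of shape (q, q-1), assumed-known simplex vertices
    Returns chi of shape (N, q) with rows summing to one.
    """
    q = vertices.shape[0]
    # Stack the vertex coordinates with a row of ones: the extra row
    # enforces the partition-of-unity constraint.
    V = np.vstack([vertices.T, np.ones(q)])           # shape (q, q)
    rhs = np.vstack([o.T, np.ones(o.shape[0])])       # shape (q, N)
    return np.linalg.solve(V, rhs).T

def assign_macrostates(chi):
    """Attribute each microstate to the macrostate with the largest membership."""
    return np.argmax(chi, axis=1)
```

Points lying outside the simplex receive negative coordinates in this sketch, which is why the actual algorithm has to choose the vertices such that the non-negativity constraint is respected.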

3.4 Macrostates and timescales 

If no prior information on the number of metasta- 
ble states to be identified is available, it is desirable 
to obtain an estimate from the data set itself. Since 
the existence of q almost invariant sets leads to q 
large eigenvalues, a criterion based on gaps in the 
eigenvalue spectrum is the natural choice. 
However, the concrete values in the spectrum of
$R$ depend on the step size of the underlying discrete
time, which is implicitly given with the input
data. Changing the timescale from 1 to $\tau$ steps, the
process has to be described by the transition matrix
$R^\tau$, whose spectral representation

$$R^\tau = \sum_k \lambda_k^\tau\, \rho_k A_k$$

is essentially the same as that of $R$, but with eigenvalues
raised to the power $\tau$.

The question which eigenvalues or which gap 
in the eigenvalue spectrum is to be considered 
"large" therefore depends on the chosen timescale. 
As Gaveau and Schulman (2005) note, a coarse-graining
of the state space always implies a corresponding
"coarse-graining" or rather change of
scale with respect to the time axis. 

We propose¹² a measure of the size of spectral
gaps that is invariant under a rescaling of the time
axis. This is achieved by transforming eigenvalues
into associated characteristic timescales,

$$T(k) = -\frac{1}{\log|\lambda_k|},$$

and introducing the timescale separation factor as the
ratio of subsequent timescales:

$$F(k) = \frac{T(k)}{T(k+1)} = \frac{\log|\lambda_{k+1}|}{\log|\lambda_k|}.$$

Substituting $\lambda_k^\tau$ for $\lambda_k$ in this equation, the resulting
factors $\tau$ cancel out, so that $F(k)$ provides a measure
of the spectral gap between eigenvalues $\lambda_k$
and $\lambda_{k+1}$ that is independent of the timescale.

Using this measure, the number of macrostates 
q is estimated as the value of k for which F{k) be- 
comes maximal. The choice q = 1 leading to a 
single macrostate comprising all microstates has 
thereby to be excluded, because it is always associated
with the largest timescale separation factor,
$F(1) = \infty$.
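
The criterion can be written down compactly in Python. The sketch below is illustrative only; the function names are hypothetical, the eigenvalues are assumed to be passed in descending order with $\lambda_1 = 1$ first, and all of them are assumed to be nonzero.

```python
import numpy as np

def timescale_separation(eigenvalues):
    """Return timescales T(k) and separation factors F(k) = T(k) / T(k+1).

    `eigenvalues` must be sorted in descending order and nonzero.
    Index 0 of the returned arrays corresponds to k = 1.
    """
    log_lam = np.log(np.abs(np.asarray(eigenvalues, dtype=float)))
    T = np.full(log_lam.shape, np.inf)     # an eigenvalue of 1 has infinite timescale
    decaying = log_lam < 0
    T[decaying] = -1.0 / log_lam[decaying]
    F = T[:-1] / T[1:]
    return T, F

def estimate_q(eigenvalues):
    """Number of macrostates: the k >= 2 for which F(k) is maximal."""
    _, F = timescale_separation(eigenvalues)
    # F[0] is F(1), which is infinite and therefore excluded.
    return int(np.argmax(F[1:])) + 2
```

Raising all eigenvalues to a common power, which corresponds to a change of the time step, leaves the separation factors for $k \geq 2$ unchanged in this sketch, in line with the invariance argument above.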

If several larger gaps exist, a ranking list of possible
$q$-values may be compiled, where each value
leads to a different possible coarse-graining of the
system into macrostates. This way different layers
of the system's dynamical structure are recovered,
which (extending Deuflhard and Weber's approach)
may be considered as the result of multiple
superimposed perturbations. An example of
this is given in the following section, where the
method is illustrated using data from a simulated
system.



4 Example: A system with four metastable macrostates

To illustrate the operation of the algorithm we apply
it to data from a simulated system, where we
can interpret the analysis results with respect to
our precise knowledge of the underlying dynamics.
We use a discrete-time stochastic system in
two dimensions, $(x_1, x_2)$, where the change over
each timestep is given by



with $a = 0.01$, $(\xi_1, \xi_2)$ standard normal two-dimensional
white noise, $b_1 = 0.03$, and $b_2 = 0.05$.
The first term of the right hand side of this equation
describes an overdamped movement within a
double-well potential along each dimension, leading
to four attracting fixed points at $(x_1, x_2) =
(\pm 1/\sqrt{2}, \pm 1/\sqrt{2})$. Without the stochastic second
term, the system would be decomposable into four
invariant sets, separated by the two coordinate
axes. But due to the noise the system performs
a random walk, staying for prolonged periods of
time in the vicinity of one of the attracting points,
but occasionally wandering into another point's
basin of attraction. These switches occur more frequently
along $x_2$ because the noise amplitude is
larger in that direction, $b_2 > b_1$.

¹² See Allefeld and Bialonski (2007) for a very similar approach
in a different context.
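
A system of this kind can be simulated with a few lines of Python. The sketch below is not the original simulation code. In particular, the drift term $a\,(x_i - 2x_i^3)$ is an assumption made for illustration: it is one simple overdamped double-well force whose stable fixed points lie at $\pm 1/\sqrt{2}$, as stated in the text, but the exact form of the update rule is not reproduced here. The function name, the starting point, and the random seed are likewise hypothetical.

```python
import numpy as np

def simulate_four_well(n_steps, a=0.01, b=(0.03, 0.05), seed=0):
    """Simulate a two-dimensional noisy double-well system.

    Update rule (assumed form): x_i <- x_i + a * (x_i - 2 * x_i**3) + b_i * xi_i,
    with xi_i drawn independently from a standard normal distribution.
    Returns an array of shape (n_steps, 2).
    """
    rng = np.random.default_rng(seed)
    b = np.asarray(b, dtype=float)
    x = np.empty((n_steps, 2))
    x[0] = 1.0 / np.sqrt(2.0)            # start at one of the attracting points
    noise = rng.standard_normal((n_steps - 1, 2))
    for t in range(n_steps - 1):
        drift = a * (x[t] - 2.0 * x[t] ** 3)   # assumed double-well drift
        x[t + 1] = x[t] + drift + b * noise[t]
    return x
```

With the noise switched off (`b=(0.0, 0.0)`) the trajectory stays at its starting fixed point; with the parameter values given in the text it wanders between the four wells, switching more often along the second coordinate.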

Data resulting from a simulation run of this system
are shown in Fig. 2. A section of the connecting
trajectory illustrates how the system state
moves through the state space, entering and leaving
the cells of the microstate partition. Counting
these transitions between cells leads to an estimate
of the reversibilized transition matrix $R$.

The largest eigenvalues of $R$ are plotted in
Fig. 3a, revealing a group of four large eigenvalues
(> 0.995), which itself is subdivided into
two groups of two eigenvalues each. This picture
becomes clearer after the transformation into
timescales $T(k)$. In Fig. 3b they are displayed
on a logarithmic scale, such that the magnitude
of timescale separation factors $F(k)$ becomes directly
visible in the vertical distances between subsequent
data points. The largest separation factor
is $F(4) = 5.02$, closely followed by $F(2) = 4.55$, indicating
that a partitioning of the state space into
$q = 4$ macrostates is optimal, while searching for
two different macrostates may also yield a meaningful
result.

Figure 2: Data points from a simulation run of a system
with four metastable macrostates over 10^ time steps,
and a part of the connecting trajectory. Straight lines
indicate the cell borders of the partition into 4096 microstates
obtained via the bipartition algorithm.

Figure 3: Eigenvalue spectrum of the transition matrix
$R$ of the system with four metastable macrostates.
(a) The eigenvalues of largest magnitude, (b) Logarithmic
timescales; locations and values of the two largest
timescale separation factors are indicated.

Figure 4: Eigenvector space $(o_1, o_2, o_3)$ of the system
with four metastable macrostates for $q = 4$. Each dot
representing a microstate is colored according to which
vertex of the enclosing tetrahedron is closest, defining
the four metastable macrostates.

Figure 5: Micro- and macrostates of the system with
four metastable macrostates. Lines indicate the cell borders
of the partition of the state space into microstates,
while the coloring of data points shows the attribution
of microstates to one of the four metastable macrostates.

The identification of almost invariant sets of mi- 
crostates defining the metastable macrostates is 
performed within the 3-dimensional eigenvector 
space $(o_1, o_2, o_3)$. Fig. 4 reveals that the points
representing microstates are located on a saddle- 
shaped surface stretched out within a 4-simplex or 
tetrahedron. The algorithm identifies the vertices 
of the tetrahedron and attributes each microstate 
to that macrostate whose defining vertex is clos- 
est, resulting in the depicted separation into four 
sets. 

In Fig. 5 this result is re-translated into the original
state space of Fig. 2, by coloring the data points
of each microstate according to the macrostate it is 
assigned to. The identified metastable states co- 
incide roughly with the basins of attraction of the 
four attracting points, i.e., the almost invariant sets 
of the system's dynamics. 

From Fig. 4 we can also assess which macrostate
definitions would be obtained by choosing
$q = 2$, the next-best choice for the number of
metastable states according to the timescale separation
factor criterion. In this case the eigenvector
space is spanned by the single dimension $o_1$,
along which the two vertices of the tetrahedron on
the left and right side, respectively, coincide. This
means that the two resulting macrostates each consist
of the union of two of the macrostates obtained
for $q = 4$. With respect to the state space, these
two macrostates correspond approximately to the
areas $x_1 > 0$ and $x_1 < 0$.

This result can be understood from the system's
dynamics: because of the smaller probability
of transitions along x_1, these two areas of the
state space form almost invariant sets, too. As can
be seen from this example, the possibility to
select different q-values of comparably good rating
may allow one to recover different dynamical levels of
a system, giving rise to a hierarchical structure of
potential macrostate definitions.



5 Application to EEG data 

For the purpose of a first application of the al- 
gorithm to neurophysiological data, we chose an 
electroencephalographic (EEG) recording from a 
patient suffering from petit-mal epilepsy, a con- 
dition characterized by the occurrence of frequent 
short (several seconds) epileptic episodes, during 
which the patient becomes irresponsive (cf. Niedermeyer
1993). This kind of data is favorable
for our methodological approach because we can
expect two clearly distinct states to be present — 
"normal" EEG / mentally present and paroxys- 
mal episodes / mentally absent — , and because it 
is possible to observe many transitions between 
these states in a recording of moderate size. 

The data set consists of a section of 89 min
length from the patient's monitoring EEG. It was
recorded from the 19 electrode positions of the
international 10-20 system (American
Electroencephalographic Society 1991) at a sampling rate of
250 Hz, digitally bandpass-filtered (2-15 Hz), and
transformed to the average reference. Due to
artifact removal by visual inspection, the amount of
data available for analysis was reduced to 71 min
total length (1 064 435 data points).

In the preceding simulation example we know 
by definition that the given values of the system 
variables immediately specify its dynamical state, 
and therefore can be directly processed by the al- 
gorithm for the identification of metastable states. 
With measurement data like EEG the situation is 
not so clear. The data "as is" may be accepted 
as a specification of the system state, but any fur- 
ther processed version of them fulfills this func- 
tion as well and may for some reason be even more 
suitable. This means that empirical data pose the 
problem of how to define the input state space for 
the analysis. 

For low-dimensional nonlinear deterministic
dynamical systems, techniques have been developed
to reconstruct the state space of the system,
or a higher-dimensional space comprising it, from
scalar time series via the method of time-delay
embedding (Takens 1981; Kantz and Schreiber
1997). However, these techniques are not
appropriate for our purposes. Firstly, the data set is
already multi-dimensional, and using the embedding
approach we would have to either blow up
the dimensionality even more, thereby introducing
a high amount of redundancy, or discard many
of the input data channels, possibly losing crucial
information. Secondly, previous attempts to
demonstrate low-dimensional nonlinear structure
in EEG data had only limited success (cf. Theiler
and Rapp 1996; Palus 1996).



Instead, we pursue the following strategy: In a 
first step, we use the original 19-dimensional data 
space as the input state space. Guided by the re- 
sults obtained in this way as well as by indepen- 
dent observations on the behavior of multichannel 
EEG, in a second step we develop a preprocess- 
ing procedure defining a more abstract input state 
space. 

5.1 Original data state space 

Using the recursive bipartitioning algorithm, the
data points were assigned to 32 768 different
compound microstates (32 or 33 points in each cell).
The resulting timescale spectrum (Fig. 6a), exhibiting
a large separation factor F(3) = 2.15, suggests
a search for three metastable macrostates in a
two-dimensional eigenvector space (Fig. 6b). This is
supported by the triangular shape, i.e. a simplex
with three vertices, of the distribution
of microstate positions within this space.
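
The cell counts quoted above follow from repeated halving: 15 levels of bipartitioning give 2^15 = 32 768 cells, and 1 064 435 / 32 768 is about 32.5, hence 32 or 33 points per cell. A minimal sketch of such a recursive bipartitioning is given below. It assumes a median split along the coordinate with the largest variance at each level, which is a simplification chosen for this illustration and may differ from the splitting rule actually used; the function name is hypothetical.

```python
import numpy as np

def bipartition_labels(data, depth):
    """Assign each row of `data` to one of 2**depth cells of nearly equal size."""
    data = np.asarray(data, dtype=float)
    labels = np.zeros(len(data), dtype=np.int64)

    def split(indices, level, label):
        if level == depth:
            labels[indices] = label
            return
        subset = data[indices]
        # Assumption: split along the coordinate axis with the largest variance.
        axis = int(np.argmax(subset.var(axis=0)))
        order = np.argsort(subset[:, axis], kind="stable")
        half = len(indices) // 2            # median split into two halves
        split(indices[order[:half]], level + 1, 2 * label)
        split(indices[order[half:]], level + 1, 2 * label + 1)

    split(np.arange(len(data)), 0, 0)
    return labels

# Toy example (hypothetical): 1000 points in three dimensions, 2**5 = 32 cells.
rng = np.random.default_rng(1)
data_demo = rng.normal(size=(1000, 3))
labels_demo = bipartition_labels(data_demo, depth=5)
cell_sizes_demo = np.bincount(labels_demo, minlength=32)
```

Because every split halves the current subset, the cell sizes differ by at most one point, which is what produces the "32 or 33 points" pattern for the EEG data.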

The arrangement of the areas belonging to the
identified macrostates in the input data space is
shown in Fig. 7, where the 19-dimensional space is
represented using the first three PCA components
of the data. The most prevalent state, accounting
for about 99% of the data points, appears here as a
centrally located spherical area, with the two other
states forming handle-like appendices at opposite
sides.

The role of these three macrostates becomes
clearer considering the transitions between them
over time, in comparison with the underlying
EEG time series (Fig. 8a). Within periods of normal
electroencephalographic activity the system
stays within the "main" macrostate, while during
seizures switches between all three states occur
regularly, corresponding to an oscillation along
the PC1 axis of Fig. 7. This macrostate dynamics
reflects the spike-wave oscillatory activity visible
in the EEG channels shown in the lower panel
of Fig. 8, which is characteristic of paroxysmal
episodes.

Figure 6: Analysis results for the original EEG data state
space. a) Timescale spectrum; a large separation factor
indicates three metastable states. b) Microstate positions
in two-dimensional eigenvector space forming a triangular
structure, and the resulting three metastable macrostates.

Figure 7: Analysis results for the original EEG data state
space: Location of data points belonging to the three
identified metastable states. The 19-dimensional state
space is represented using the first three PCA components
of the data.

Figure 8: Analysis results illustrated using a segment of 40 s length. Upper panels: Macrostate dynamics over
time, resulting from different input state space definitions. a) Original EEG data state space. b) Amplitude vector
state space. c) As in (b), but using normalized amplitude vectors. Lower panel: EEG time series at four selected
recording sites. Paroxysmal episodes are characterized by short bursts of spike-wave activity.

With this first result, the attempt at identifying
emergent macrostates using the EEG data space is
only partially successful: The occurrence of states
correlates strongly with those features of the
underlying process which are psychophysiologically
most important, and also most prominent in visual
inspection of the data. However, the two states
expected are not directly recovered by the EEG
analysis. Instead of one persistent state during
paroxysmal episodes, we find rapid oscillatory changes
between states including the one associated with
normal EEG. This indicates that the input state
space is not yet optimally defined.

5.2 Amplitude vector state space 

This finding can be understood from the fact 
that electroencephalographic activity in general is 
so strongly shaped by a predominant oscillatory 
layer of the dynamics — not only during epilep- 
tic episodes but also in normal EEG, particularly 
in the form of the alpha rhythm — that it is hard 
to discern more subtle dynamical features. To recover
those features, we need a preprocessing step
that eliminates the oscillatory character of the data 
but retains the more slowly changing parameters 
of the oscillation. 

Figure 9: Amplitude vector state space. The trajectory
corresponding to a multivariate oscillatory signal
like EEG takes on the form of an elliptical orbit with
slowly varying parameters. Locally matching ellipses to
the trajectory, the instantaneously dominant oscillatory
component can be characterized by the major semiaxis
vectors (straight radial lines), resulting in a description
of the system's oscillatory state which itself evolves in a
non-oscillatory way (black curve).

Figure 10: Analysis results for the amplitude vector
state space. a) Timescale spectrum indicating the presence
of two metastable states. b) Microstate positions
in one-dimensional eigenvector space and the resulting
two metastable macrostates.

As observed by Wackermann (1994), the trajectory
formed by multichannel EEG within the data
space can be approximated by a movement along
an elliptical orbit with slowly changing orientation
and shape (Fig. 9). By locally matching ellipses
to the data, a global instantaneous phase
and amplitude can be defined, where the amplitude
corresponds to the two main semiaxis vectors
of the ellipse. (For a full account of the calculation
see App. A.) For simplicity we only use the major
semiaxis vector, which specifies the direction
and strength of the momentarily dominant oscillatory
component, to define an amplitude vector
state space as the input state space for the
algorithm.¹

With the specification of the system state via the
major amplitude vector an ambiguity arises,
because vectors of opposite orientation are equivalent.
This is resolved by enforcing positive sign
for the first vector component during the assignment
of data points to microstates. For visualization
(Fig. 11) the axis vectors are used as they come
out of the calculation described in App. A, that is,
with basically random orientation.
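
The sign convention described above amounts to flipping every amplitude vector whose first component is negative. A minimal sketch, with a hypothetical function name and made-up example vectors:

```python
import numpy as np

def canonical_orientation(vectors):
    """Flip each row so that its first component is non-negative."""
    v = np.asarray(vectors, dtype=float)
    signs = np.where(v[:, 0] < 0, -1.0, 1.0)   # -1 for rows that need flipping
    return v * signs[:, None]

# Toy example (hypothetical amplitude vectors with two channels).
vectors_demo = np.array([[-1.0, 2.0], [3.0, -4.0], [0.0, -5.0]])
oriented_demo = canonical_orientation(vectors_demo)
```

Rows whose first component is exactly zero are left unchanged in this sketch; how such cases should be treated is not stated in the text.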

¹ This approach is similar to one of the strategies employed
in the "spatial analysis" of EEG (Lehmann 1987), namely to select only
those EEG potential maps (data vectors) which occur at local
maxima of the "global field strength" (the norm of the data
vectors).

Figure 11: Analysis results for the amplitude vector
state space: Location of data points belonging to the
two identified metastable states in a representation of
the state space using the first three PCA components.

The resulting timescale spectrum is shown in
Fig. 10a. The largest separation factor of F(2) =
4.23 now gives a more definite indication of the
number of macrostates than for the original data
state space. In the corresponding one-dimensional
eigenvector space (Fig. 10b) the two macrostates
are trivially defined by a cut at the center of the
range of values.
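
For q = 2 the two simplex vertices are simply the smallest and the largest value along the single eigenvector dimension, so nearest-vertex assignment reduces to a threshold at their midpoint. A minimal sketch with a hypothetical function name and made-up values:

```python
import numpy as np

def two_state_cut(o1):
    """Split one-dimensional eigenvector coordinates at the center of their range."""
    o1 = np.asarray(o1, dtype=float)
    center = 0.5 * (o1.min() + o1.max())    # midpoint between the two extreme values
    labels = (o1 > center).astype(int)      # 0 below the cut, 1 above it
    return labels, center

# Toy example (hypothetical eigenvector coordinates of five microstates).
o1_demo = np.array([-0.2, -0.1, 0.0, 0.9, 1.0])
labels_demo, center_demo = two_state_cut(o1_demo)
```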
In Fig. 11 the location of data points belonging
to the two macrostates is shown using the first
three PCA components of the data points in the
amplitude vector state space. Again, a prevalent
macrostate (accounting for 98% of the data points)
fills a spherically shaped central area, while two
appendices protruding on opposite sides together
constitute the second macrostate. Despite the fact
that the overall shape of the data cloud is similar
to that shown in Fig. 7, the reader should keep in
mind that the two diagrams depict differently
defined state spaces represented with respect to a
different set of dimensions.

Fig. 8b demonstrates that the revision of the input
state definition successfully eliminates the
oscillatory switching between states during paroxysmal
episodes. Starting from the amplitude vector
input state space, the algorithm for the identifi- 
cation of metastable states is able to consistently 
associate normal EEG with one macrostate, and — 
except for short relapses — epileptic EEG with an- 
other macrostate. 

Figure 12: Timescale spectrum obtained using normalized
axis vectors. The largest separation factor (for six
macrostates) is only marginally larger than the other
occurring values, indicating that no adequate definition of
metastable states is possible.

The macrostate structure of the amplitude vector
state space shown in Fig. 11 suggests that
the distinction of the two macrostates relies only
on the length of the amplitude vector. To check
this, we tested the performance of the algorithm
when normalized amplitude vectors are used. The
timescale spectrum (Fig. 12), with a maximal
separation factor F(6) = 1.21 not substantially larger
than the rest, indicates that the identification of
macrostates is severely impaired under these
circumstances. Even so, an examination of the state
dynamics over time (Fig. 8c) reveals that there are
still two states that are mainly attained during
epileptic episodes.



6 Conclusion 

Relations between mental (psychological) and 
neural (physiological) phenomena form the gener- 
ally accepted basis for work in various disciplines 
such as psychiatry, psychophysiology, and cogni- 
tive neuroscience. While a large body of knowl- 
edge has been gathered in these fields, the con- 
ceptual question of how mind and brain are re- 
lated in precise terms is still largely unresolved. 
Starting from the notion of the mental as "emerg- 
ing" from neural processes, we argue that this re- 
lation of emergence should be understood as one 
between different descriptions of the same system. 

Utilizing concepts from the theory of dynam- 
ical systems for the formulation of descriptions, 
we propose that the relation between descriptive 
levels should take on the form of a partition or 
coarse-graining of the state space that is character- 
ized by a preservation of the Markov property. To 
empirically test the validity of our approach, we
turn to a form of such a Markov coarse-graining
which can be algorithmically obtained: that of me- 
tastable states. We describe how metastable ma- 
crostates of a dynamics observed in empirical data 
can be identified based on the spectral analysis of 
the transition matrix governing the microstate dy- 
namics, and illustrate its operation with simula- 
tion data. 

We apply the method to a recording of elec- 
troencephalographic (EEG) data from a human 
subject suffering from petit-mal epilepsy. Com- 
bined with a suitable preprocessing procedure, the 
algorithm is able to automatically identify meta- 
stable states from the data which closely corre- 
spond to the mental states of the subject (mentally 
present / absent). This first application substanti- 
ates the practical viability of our approach and ap- 
pears promising for the future application of the 
method to more challenging forms of data. 

Finally we want to point out that the concept 
of metastable macrostates in the application to 
EEG data is similar to the notion of "brain func- 
tional microstates" introduced by Lehmann and 
co-workers, which are defined as brief periods 
of time during which the spatial distribution of 
the brain's electrical field remains relatively stable
(Lehmann 1971; Lehmann et al. 1987). Transitions
between such states are characterized by an
abrupt change of the field topography, which allows one to
decompose the stream of EEG data into segments
of the order of magnitude of 10-100 ms duration 
which can usually be grouped into a small num- 
ber (< 10) of classes. Note however that the "mi- 
crostate analysis" of Lehmann et al. results in a 
coarse-grained description of the brain's electrical 
activity, i.e. in our nomenclature, a definition of 
macrostates. 
A Instantaneous amplitude and phase for multivariate time series

The local oscillatory behavior of a real-valued
univariate signal $x(t)$ is commonly characterized
using the corresponding complex-valued analytic
signal $z(t)$ (Gabor 1946). It is obtained by
combining $x(t)$ with an imaginary part,

$$z(t) = x(t) + i\,y(t),$$

which is defined as the Hilbert transform of $x$,

$$y(t) = \mathcal{H}x(t) = \frac{1}{\pi}\,\mathrm{P.V.}\!\int_{-\infty}^{\infty} \frac{x(t')}{t - t'}\,dt',$$

where P.V. denotes the Cauchy principal value of
the integral. Under the condition that $x(t)$ is
dominated by a single frequency component, its
instantaneous amplitude $A(t)$ and phase $\phi(t)$ can be
determined via the analytic signal according to

$$A(t) = |z(t)|, \qquad \phi(t) = \arg z(t),$$

so that

$$x(t) = A(t)\cos\phi(t)$$

or

$$z(t) = A(t)\exp(i\phi(t)).$$

The terms amplitude $A$ and phase $\phi$ as they are
used here can be interpreted such that they
specify the parameters of a strictly periodic sinusoidal
oscillation which locally matches the behavior of the
observed signal $x(t)$ at a given instant $t$. In
particular, $\phi(t)$ attains the value $0$ (or equivalently, an
integer multiple of $2\pi$) whenever the actual value
of $x(t)$ coincides with the associated instantaneous
amplitude $A(t)$.
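
For sampled data the analytic signal is usually computed via the discrete Fourier transform, by suppressing the negative-frequency half of the spectrum. The following sketch illustrates this for a test signal with known amplitude and phase; the FFT-based construction, the function name, and the test signal (a 10 Hz cosine sampled at 250 Hz) are assumptions of this example and are not taken from the text.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal z = x + i*H[x] along axis 0, computed via the FFT."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    spectrum = np.fft.fft(x, axis=0)
    weights = np.zeros(n)
    weights[0] = 1.0                         # keep the zero-frequency term
    if n % 2 == 0:
        weights[n // 2] = 1.0                # keep the Nyquist term
        weights[1:n // 2] = 2.0              # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    shape = (n,) + (1,) * (x.ndim - 1)       # broadcast over channels, if any
    return np.fft.ifft(spectrum * weights.reshape(shape), axis=0)

# Test signal (hypothetical): amplitude 3, frequency 10 Hz, phase offset 0.5.
t = np.arange(1000) / 250.0
x_demo = 3.0 * np.cos(2 * np.pi * 10.0 * t + 0.5)
z_demo = analytic_signal(x_demo)
A_demo = np.abs(z_demo)                      # instantaneous amplitude A(t)
phi_demo = np.angle(z_demo)                  # instantaneous phase, wrapped to (-pi, pi]
```

Because the test signal completes a whole number of cycles within the window, the recovered amplitude is constant and `A_demo * np.cos(phi_demo)` reproduces the input.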

These properties of the analytic signal can also
be utilized to determine the parameters of the
locally matching oscillation for a multivariate
signal $x(t) = (x_i(t))$ $(i = 1 \ldots K)$. We assume that
each component signal $x_i(t)$ is dominated by a
single frequency and that the frequencies of
different signals are similar. Using $y(t)$ to denote the
channel-wise Hilbert transform of $x(t)$ and $z(t)$ for
its channel-wise completion to the analytic signal,
the local extension of the signal's oscillatory
behavior for instant $t$ is obtained with

$$z_t(\theta) = z(t)\exp(i\theta),$$

parametrized by $\theta \in [0, 2\pi]$. Its real part

$$x_t(\theta) = x(t)\cos\theta - y(t)\sin\theta$$

gives the multivariate oscillation that locally
matches the behavior of the signal at instant $t$; its
trajectory is an elliptical orbit with conjugate axes
specified by the vectors $x(t)$ and $y(t)$.

Figure 13: Determination of local ellipse axes. The
trajectory formed by the multivariate signal is locally
matched to an elliptical orbit, which is defined by the
data vector x at a given instant and the corresponding
vector y from the signal's channel-wise Hilbert transform
as conjugate semiaxis vectors. Main semiaxis vectors
of the ellipse, a and b, are obtained using the
associated multivariate instantaneous phase φ.

From these conjugate axes, the main semiaxis
vectors $a(t)$ and $b(t)$ of the local ellipse can be
calculated (Fig. 13). It proves useful to do so by
introducing a global (channel-independent)
instantaneous phase $\phi(t)$, such that for
$\phi(t) \in \{0, \tfrac{1}{2}\pi, \pi, \tfrac{3}{2}\pi\}$
or equivalents, $x(t)$ coincides with one of
the main semiaxis vectors or its negative. This is
achieved by choosing

$$\phi(t) = \frac{1}{2}\arctan\frac{2\,x(t)\cdot y(t)}{|x(t)|^2 - |y(t)|^2}.$$

Since the resulting values in the range $[-\tfrac{\pi}{4}, \tfrac{\pi}{4}]$
cover only one quarter of a cycle, the outcome may
be transformed into an equivalent but more useful
representation via a standard "unwrapping"
procedure (adding or subtracting $\tfrac{\pi}{2}$ at discontinuity
points) to enforce a smooth evolution of $\phi(t)$.



Using this result, main semiaxis vectors of the
locally matching ellipse at instant $t$ are obtained
by going backwards along $x_t(\theta)$ by an amount of
$\phi(t)$ or forwards by $\tfrac{\pi}{2} - \phi(t)$, i.e.

$$a(t) = x_t(-\phi(t))$$

and

$$b(t) = x_t\!\left(\tfrac{\pi}{2} - \phi(t)\right).$$

If $\phi(t)$ has been adjusted for a smooth evolution
over time, the same can be expected from the
resulting $a(t)$ and $b(t)$. It is, however, not clear from
this definition which one of these vectors
specifies the major and minor axis of the ellipse,
respectively, and it is possible that over the course of time
the two vectors change roles. For a specific
application of this result, further processing may
therefore be necessary.
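
The calculation of the local ellipse axes can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the analytic signal is computed via the FFT, and the phase uses the two-argument arctangent (`np.arctan2`) in place of the plain arctangent with subsequent unwrapping. With this variant the phase lies in the range from $-\pi/2$ to $\pi/2$ and $a(t)$ is always the longer semiaxis, but its orientation may flip sign between samples. The function names and the two-channel test signal are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Channel-wise analytic signal along axis 0, computed via the FFT."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    shape = (n,) + (1,) * (x.ndim - 1)
    return np.fft.ifft(np.fft.fft(x, axis=0) * weights.reshape(shape), axis=0)

def local_ellipse_axes(x):
    """Global phase and main semiaxis vectors for x of shape (samples, channels)."""
    x = np.asarray(x, dtype=float)
    y = analytic_signal(x).imag              # channel-wise Hilbert transform
    xy = np.sum(x * y, axis=1)
    xx = np.sum(x * x, axis=1)
    yy = np.sum(y * y, axis=1)
    # Assumption: arctan2 variant, so that a(t) is the major semiaxis.
    phi = 0.5 * np.arctan2(2.0 * xy, xx - yy)
    c, s = np.cos(phi)[:, None], np.sin(phi)[:, None]
    a = x * c + y * s                        # x_t(-phi)
    b = x * s - y * c                        # x_t(pi/2 - phi)
    return phi, a, b

# Test signal (hypothetical): ellipse with semiaxes 2 (channel 1) and 1 (channel 2).
t = np.arange(1000) / 250.0
omega = 2 * np.pi * 10.0
x_demo = np.column_stack([2 * np.cos(omega * t), np.sin(omega * t)])
phi_demo, a_demo, b_demo = local_ellipse_axes(x_demo)
```

For this test signal `a_demo` equals (2, 0) and `b_demo` equals (0, 1) up to sign at every sample, and the two vectors are orthogonal, as expected for the main axes of the ellipse.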

Complementary to the generalization of the
instantaneous phase concept, a multivariate
instantaneous amplitude $A(t)$ can be defined such that

$$z(t) = A(t)\exp(i\phi(t)),$$

which is given by

$$A(t) = a(t) - i\,b(t).$$

The channel-wise modulus of this quantity
corresponds to the instantaneous amplitudes $A_i(t)$ of
the component signals, while the argument
comprises the phase differences between the
component signal and the global instantaneous phases,
$\phi_i(t) - \phi(t)$.